Clustering Large Sparse Text Data: A Comparative Advantage Approach
Similar resources
Clustering Large and Sparse Co-occurrence Data
A novel approach to clustering co-occurrence data poses it as an optimization problem in information theory — in this framework, an optimal clustering is one which minimizes the loss in mutual information. Recently a divisive clustering algorithm was proposed that monotonically reduces this loss function. In this paper we show that sparse high-dimensional data presents special challenges which ...
Do Country Stereotypes Exist in Educational Data? A Clustering Approach for Large, Sparse, and Weighted Data
Certain stereotypes can be associated with people from different countries. For example, the Italians are expected to be emotional, the Germans functional, and the Chinese hard-working. In this study, we cluster all 15-year-old students representing the 68 different nations and territories that participated in the latest Programme for International Student Assessment (PISA 2012). The hypothesis...
A Comparative Study of Some Clustering Algorithms on Shape Data
Recently, some statistical studies have been done using the shape data. One of these studies is clustering shape data, which is the main topic of this paper. We are going to study some clustering algorithms on shape data and then introduce the best algorithm based on accuracy, speed, and scalability criteria. In addition, we propose a method for representing the shape data that facilitates and ...
Large Scale Sparse Clustering
Large-scale clustering has found wide applications in many fields and received much attention in recent years. However, most existing large-scale clustering methods can only achieve mediocre performance, because they are sensitive to the unavoidable presence of noise in the large-scale data. To address this challenging problem, we thus propose a large-scale sparse clustering (LSSC) algorithm. I...
Do Country Stereotypes Exist in PISA? A Clustering Approach for Large, Sparse, and Weighted Data
Certain stereotypes can be associated with people from different countries. For example, the Italians are expected to be emotional, the Germans functional, and the Chinese hard-working. In this study, we cluster all 15-year-old students representing the 68 different nations and territories that participated in the latest Programme for International Student Assessment (PISA 2012). The hypothesis...
Journal
Journal title: Journal of Information Processing
Year: 2010
ISSN: 1882-6652
DOI: 10.2197/ipsjjip.18.242